Discovering patterns in visual speech

Author

  • Stephen Cox
Abstract

We know that an audio speech signal can be unambiguously decoded by any native speaker of the language it is uttered in, provided that it meets some quality conditions. But we do not know if this is the case with visual speech, because the process of lipreading is rather mysterious and seems to rely heavily on the use of context and non-speech cues. How much information about the speech content is there in a visual speech signal? We attempt to answer this question by ‘discovering’ matching segments of phoneme sequences that represent recurring words and phrases in audio and visual representations of the same speech. We use a modified version of the technique of segmental dynamic programming that was introduced by Park and Glass. Comparison of the results shows that visual speech displays rather less matching content than the audio, and reveals some interesting differences in the phonetic content of the information recovered by the two modalities.
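
The idea of discovering matching segments in two phoneme sequences can be illustrated with a small sketch. This is not the modified segmental dynamic programming used in the paper, whose details the abstract does not give; it swaps in plain Smith-Waterman local alignment over symbolic phoneme lists as a simplified stand-in. The function name `best_matching_segment`, the scoring values, and the example phoneme strings are all assumptions made for illustration.

```python
def best_matching_segment(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment of two phoneme lists.

    Illustrative stand-in only, not the paper's algorithm.
    Returns (score, (a_start, a_end), (b_start, b_end)) with half-open ranges.
    """
    n, m = len(a), len(b)
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # align the two phonemes
                          H[i - 1][j] + gap,     # skip a phoneme in a
                          H[i][j - 1] + gap)     # skip a phoneme in b
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    end_i, end_j = i, j
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, (i, end_i), (j, end_j)


# Hypothetical phoneme strings sharing the segment "ax k ae t".
utt_a = "dh ax k ae t s ae t".split()
utt_b = "ay s ao ax k ae t".split()
score, (a0, a1), (b0, b1) = best_matching_segment(utt_a, utt_b)
print(score, utt_a[a0:a1], utt_b[b0:b1])
```

Running the same search on phoneme sequences recognised from audio and on sequences recognised from video, and then comparing how many segments are recovered, is one way to picture the comparison the abstract describes.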

Similar resources

Comparative Effect of Visual and Auditory Teaching Techniques on Retention of Word Stress patterns: A Case Study of English as a Foreign Language Curriculum in Iran

This study aimed at investigating the effect of visual (Cuisenaire Rods) and auditory nonsensical monosyllables using Pratt speech processing software as teaching techniques on retention of word stress. To this end, 60 high school participants made the two experimental groups of the study each having 30 students on the basis of their proficiency scores on KET (Key English Test). In one experime...

Question Type Classification Using a Part-of-Speech Hierarchy

Question type (or answer type) classification is the task of determining the correct type of the answer expected to a given query. This is often done by defining or discovering syntactic patterns that represent the structure of typical queries of each type, and classify a given query according to which pattern they satisfy. In this paper, we combine the idea of using informer spans as patterns ...

Requestive Speech Acts Realization Patterns: Observation from Persian

Without knowing the speech act functions, it would be difficult to make correct requests in a language. Studies in pragmalinguistics have shown that conventionally direct and indirect requestive patterns are perceived differently in different speech communities. This study investigates the perception of the requestive speech acts by Persian native speakers to determine the socially appropriate ...

A Learning Algorithm for Question Type Classification

Question type (or answer type) classification is the task of determining the correct type of the answer expected to a given query. This is often done by defining or discovering syntactic patterns that represent the structure of typical queries of each type, and classify a given query according to which pattern they satisfy. In this paper, we combine the idea of using informer spans as patterns ...

Discovering Convolutive Speech Phones using Sparseness and Non-Negativity Constraints

Discovering a representation that allows auditory data to be parsimoniously represented is useful for many machine learning and signal processing tasks. Such a representation can be constructed by Nonnegative Matrix Factorisation (NMF), which is a method for finding parts-based representations of non-negative data. Here, we present an extension to convolutive NMF that includes a sparseness cons...

Publication date: 2015